Implementing and Evaluating Warehouses and Summaries Over a Cluster

Author

  • Pedro Furtado
Abstract

Cluster computing offers a promising way to improve response time in large data warehouses. In addition, using sampling summaries on the cluster for approximate answering of OLAP queries yields a very flexible system that can provide response-time guarantees. In this paper we explore the cluster computation paradigm for data warehouses and summaries. Cluster computation over a network of N computers can speed up query processing by a factor of about N, and further speedup can be obtained by using samples instead of the full data. Sampling summaries have been proposed before in the context of OLAP queries to avoid processing times that leave users and applications waiting too long when only exploratory analysis over more or less aggregated data is required. But while a typical one-node sampling summary is either too small to answer more detailed queries or too slow to provide almost instant response times, summaries over a cluster are extremely fast and sufficiently large to answer most aggregation query patterns. We explore the implementation and processing of the data warehouse and sampling summaries over a set of nodes for cooperative cluster computing and present experimental results on the subject.
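The idea of answering an aggregation query from per-node samples can be illustrated with a minimal Python sketch. It is not the paper's implementation: the nodes are simulated in a single process, and the round-robin partitioning, the uniform per-node sampling, and all function names (`partition`, `build_summaries`, `exact_query`, `approximate_query`) are assumptions made for illustration. Each node aggregates its own rows (or its own sample, scaled up by the inverse sampling rate), and a coordinator merges the partial results.

```python
import random
from collections import defaultdict

# Illustrative sketch only: nodes are plain Python lists, and every name
# below is hypothetical rather than taken from the paper.

def partition(rows, n_nodes):
    """Spread fact rows round-robin over n_nodes simulated nodes."""
    return [rows[i::n_nodes] for i in range(n_nodes)]

def build_summaries(nodes, fraction, rng):
    """Keep a uniform random sample of each (non-empty) node's rows."""
    summaries = []
    for node_rows in nodes:
        k = max(1, round(len(node_rows) * fraction))
        summaries.append(rng.sample(node_rows, k))
    return summaries

def local_group_sum(rows, scale=1.0):
    """SUM(value) GROUP BY group on one node, multiplied by scale."""
    totals = defaultdict(float)
    for group, value in rows:
        totals[group] += value
    return {g: t * scale for g, t in totals.items()}

def merge(partials):
    """Coordinator step: add up the partial sums from all nodes."""
    merged = defaultdict(float)
    for partial in partials:
        for g, t in partial.items():
            merged[g] += t
    return dict(merged)

def exact_query(nodes):
    """Exact answer: every node scans its full partition."""
    return merge(local_group_sum(n) for n in nodes)

def approximate_query(nodes, summaries):
    """Estimate: every node scans only its sample and scales the sum
    by (partition size / sample size)."""
    return merge(
        local_group_sum(s, scale=len(n) / len(s))
        for n, s in zip(nodes, summaries)
    )

if __name__ == "__main__":
    rng = random.Random(0)
    rows = [(rng.choice("ABCD"), rng.randint(1, 100)) for _ in range(10_000)]
    nodes = partition(rows, 4)
    summaries = build_summaries(nodes, 0.1, rng)
    print("exact      :", exact_query(nodes))
    print("approximate:", approximate_query(nodes, summaries))
```

In this sketch each node touches only its own partition or sample, so the per-node work shrinks as nodes are added, while the combined sample across all nodes grows with the cluster size. The estimate is only as good as the sample: groups with few rows may be poorly represented or missing from it.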

Similar resources

Specification-Based Data Reduction in Dimensional Data Warehouses

Many data warehouses contain massive amounts of data and grow rapidly. Examples include warehouses with retail sales data capturing customer behavior and warehouses with click-stream data capturing user behavior on web sites. The sheer size of these warehouses makes them increasingly hard to manage and query efficiently. As time passes, old, detailed data in the warehouses tend to become less i...

An Intelligent, Semantics-Oriented System for Evaluating Text Summarization Systems

Summarizers and machine translators have attracted much attention, and considerable work on building such tools has been done around the world. For Farsi, as for other languages, there have been efforts in this field, so evaluating such tools is of great importance. Human evaluations of machine summarization are extensive but expensive. Human evaluations can take months to f...

An Analysis of User Strategies for Examining and Processing Ranked Lists of Documents

The predominant display of document retrieval results is a ranked list of query-biased summaries. When examining and processing search results, users must make complex decisions about how to allocate their time and relevance judging effort between evaluation of summaries and the full documents reachable with a mouse click on a summary. We performed a cluster analysis of the search results proce...

Generating Update Summaries for DUC 2007

Update summaries as defined for the new DUC 2007 task deliver focused information to a user who has already read a set of older documents covering the same topic. In this paper, we show how to generate this kind of summary from the same data structure—fuzzy coreference cluster graphs—as all other generic and focused multi-document summaries. Our system ERSS 2007 implementing this algorithm also...

Utility of Ranking Warehouse Candidates in Workshop Locations Using UTAStar

Although the importance of facility location in manufacturing and service companies is not a new issue, one significant application is determining the appropriate location for warehouses in manufacturing workshops for holding materials or products. In any organization, finding a suitable site for warehouses to increase customer service and efficiency is one o...

Publication date: 2003